{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Sumologic - Data Connector"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Description\n",
    "The data provider module of msticpy provides functions to allow for the defining of data sources, connectors to them and queries for them as well as the ability to return query result from the defined data sources. \n",
    "\n",
    "For more information on Data Propviders, check documentation\n",
    "- Data Provider: https://msticpy.readthedocs.io/en/latest/data_acquisition/DataProviders.html\n",
    "\n",
    "In this notebooks we will demonstrate Sumologic data connector feature of msticpy. \n",
    "This feature is built on-top of the [Sumologic SDK for Python] (https://github.com/SumoLogic/sumologic-python-sdk) with some customizations and enhancements."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Installation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-08-07T19:14:41.244954Z",
     "start_time": "2020-08-07T19:14:41.241519Z"
    }
   },
   "outputs": [],
   "source": [
    "# Only run first time to install/upgrade msticpy to latest version\n",
    "# %pip install --upgrade msticpy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Authentication\n",
    "\n",
    "Authentication for the Sumologic data provider is handled by specifying credentials (accessid and accesskey) directly in the connect call or specifying the credentials in msticpy config file.\n",
    "\n",
    "For more information on how to create credentials, follow Sumologic Docs [Access Keys](https://help.sumologic.com/Manage/Security/Access-Keys) and [Users and Roles](https://help.sumologic.com/Manage/Users-and-Roles). The user should have permission to at least run its own searches or more depending upon the actions to be performed by user.\n",
    "\n",
    "Once you created user account with the appropriate roles, you will require the following details to specify while connecting\n",
    "- connection_str = \"https://api.us2.sumologic.com/api\" (Sumologic url endpoint depending on which region is used)\n",
    "- accessid = \"xxx\" (as created in Sumologic user preferences)\n",
    "- accesskey = \"xxx\" (same)\n",
    "\n",
    "Once you have details, you can specify it in `msticpyconfig.yaml` as shown in below example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-08-07T17:50:18.361039Z",
     "start_time": "2020-08-07T17:50:18.349006Z"
    }
   },
   "source": [
    "```\n",
    "DataProviders:\n",
    "  Sumologic:\n",
    "    Args:\n",
    "      connection_str: \"{Sumologic url endpoint}\"\n",
    "      accessid: \"{accessid with search permissions to connect}\"\n",
    "      accesskey: \"{accesskey of the user specified}\"\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-08-07T19:14:46.801520Z",
     "start_time": "2020-08-07T19:14:44.889959Z"
    }
   },
   "outputs": [],
   "source": [
    "# Check we are running Python 3.6\n",
    "import sys\n",
    "\n",
    "MIN_REQ_PYTHON = (3, 6)\n",
    "if sys.version_info < MIN_REQ_PYTHON:\n",
    "    print(\"Check the Kernel->Change Kernel menu and ensure that Python 3.6\")\n",
    "    print(\"or later is selected as the active kernel.\")\n",
    "    sys.exit(\"Python %s.%s or later is required.\\n\" % MIN_REQ_PYTHON)\n",
    "\n",
    "# imports\n",
    "import pandas as pd\n",
    "from datetime import datetime, timedelta\n",
    "\n",
    "# data library imports\n",
    "from msticpy.data.data_providers import QueryProvider\n",
    "\n",
    "print(\"Imports Complete\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# use custom config file?\n",
    "# import os\n",
    "# os.environ['MSTICPYCONFIG'] = '/path/to/msticpyconfig.yaml'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "## Instantiating a query provider\n",
    "\n",
    "You can instantiate a data provider for Sumologic by specifying the credentials in connect or in msticpy config file. \n",
    "<br> If the details are correct and authentication is successful, it will show connected.\n",
    "\n",
    "URL endpoints are referenced on [Sumo Logic Endpoints and Firewall Security](https://help.sumologic.com/APIs/General-API-Information/Sumo-Logic-Endpoints-and-Firewall-Security)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-08-07T19:15:09.229632Z",
     "start_time": "2020-08-07T19:15:08.536426Z"
    },
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "sumologic_prov = QueryProvider(\"Sumologic\")\n",
    "# sumologic_prov.connect(connection_str=<url>, accessid=<accessid>, accesskey=<accesskey>)\n",
    "sumologic_prov.connect()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Running a Ad-hoc Sumologic query\n",
    "You can define your own sumologic query and run it via sumologic provider via `QUERY_PROVIDER.exec_query(<queryname>)`\n",
    "\n",
    "For more information, check documentation [Running and Ad-hoc Query](https://msticpy.readthedocs.io/en/latest/data_acquisition/DataProviders.html#running-an-ad-hoc-query)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sumologic_query = \"\"\"\n",
    "*\n",
    "| formatDate(_messageTime,\"yyyy/dd/MM HH:mm:ss\") as date\n",
    "| first(date), last(date) by _sourceCategory\n",
    "| count _sourceCategory,_first,_last\n",
    "| sort -_count\n",
    "\"\"\"\n",
    "df = sumologic_prov.exec_query(sumologic_query, days=0.2, verbosity=3)\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sumologic_query = \"\"\"_index=WINDOWS | count _sourceCategory,hostname\"\"\"\n",
    "df = sumologic_prov.exec_query(\n",
    "    sumologic_query,\n",
    "    start_time=datetime.now() - timedelta(days=6.001),\n",
    "    end_time=datetime.now() - timedelta(days=6),\n",
    ")\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sumologic_query = \"\"\"_index=WINDOWS | timeslice 1d | count _sourceCategory,_timeslice\"\"\"\n",
    "df = sumologic_prov.exec_query(\n",
    "    sumologic_query,\n",
    "    start_time=datetime.now() - timedelta(days=3),\n",
    "    end_time=datetime.now(),\n",
    ")\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# map._count should be int64, map._timeslice timestamp, others object type.\n",
    "df.dtypes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sumologic_query = \"\"\"_index=WINDOWS\"\"\"\n",
    "df = sumologic_prov.exec_query(\n",
    "    sumologic_query,\n",
    "    start_time=datetime.now() - timedelta(days=1),\n",
    "    end_time=datetime.now(),\n",
    "    limit=10,\n",
    "    verbosity=1,\n",
    "    numeric_columns=[\"map.eventid\"],\n",
    ")\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# time_columns, _receipttime and _messagetime should be datetime64 type\n",
    "# eventid should be int64\n",
    "df.dtypes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test paging (per page limit = 10k), should return 11000 results\n",
    "sumologic_query = \"\"\"_index=WINDOWS | count _messageid\"\"\"\n",
    "df = sumologic_prov.exec_query(\n",
    "    sumologic_query,\n",
    "    start_time=datetime.now() - timedelta(hours=1),\n",
    "    end_time=datetime.now(),\n",
    "    limit=11000,\n",
    "    verbosity=3,\n",
    ")\n",
    "df.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test paging (per page limit = 10k), should return all results\n",
    "sumologic_query = \"\"\"_index=WINDOWS | count _messageid\"\"\"\n",
    "df = sumologic_prov.exec_query(\n",
    "    sumologic_query,\n",
    "    start_time=datetime.now() - timedelta(hours=1),\n",
    "    end_time=datetime.now(),\n",
    "    verbosity=2,\n",
    ")\n",
    "df.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References\n",
    "\n",
    "- Sumologic SDK for Python: https://github.com/SumoLogic/sumologic-python-sdk\n",
    "- Sumologic Community : https://support.sumologic.com/hc/en-us/community/topics\n",
    "- Sumologic Documentation: https://help.sumologic.com/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "hide_input": false,
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "calc(100% - 180px)",
    "left": "10px",
    "top": "150px",
    "width": "185.554px"
   },
   "toc_section_display": true,
   "toc_window_display": true
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
